11/03/04 18:19 FAX 732 530 9808 



MOSER PATTERSON SHERIDAN -> PTO 






09/894,898 



REMARKS 

In view of the following discussion, the Applicant submits that none of the claims 
now pending in the application is anticipated under the provisions of 35 U.S.C. § 102 or 
made obvious under the provisions of 35 U.S.C. § 103. Thus, the Applicant believes 
that all of these claims are now in allowable form. 

I. REJECTION OF CLAIMS 27, 29 AND 31 UNDER 35 U.S.C. § 112 

The Examiner has rejected claims 27, 29 and 31 under 35 U.S.C. § 112 for 
allegedly failing to comply with the enablement requirement and for allegedly being 
indefinite. In response, the Applicant has amended claims 27, 29 and 31 in order to 
more clearly recite aspects of the present invention. 

Specifically, claims 27, 29 and 31 have been amended to recite a state threshold 
that is "dynamically adjustable", replacing a state threshold that is "dynamically 
adjusted". The Applicant submits that a dynamically adjustable threshold for assessing 
a probability of a match between an element of a subgrammar and an input speech 
signal is described in the specification in a manner that is sufficiently enabling and 
definite (see, for example, paragraph [0027]: "A dynamically adjustable threshold may 
be used to determine the probability of a word match."). Accordingly, the Applicant 
respectfully requests that the rejection of claims 27, 29 and 31 under 35 U.S.C. § 112 
be withdrawn. 

II. REJECTION OF CLAIMS 1, 2 AND 8-35 UNDER 35 U.S.C. § 102 

The Examiner has rejected claims 1, 2 and 8-35 in the Office Action as being 
anticipated by the Brown et al. patent (US patent 5,719,997, issued on February 17, 
1998, hereinafter Brown). The Applicant respectfully traverses the rejection. 

Brown teaches a speech recognition system that uses an evolutional grammar to 
recognize an input speech signal in real time. In particular, Brown teaches that as 
speech recognition processing begins, only a portion of a system grammar (i.e., a 
vocabulary comprising a plurality of interrelated words) is implemented for recognition 
purposes. As more of the speech signal is received by the system and as processing 
proceeds, additional portions of the grammar network (i.e., additional words or 
vocabulary) are implemented as necessary. In other words, a single system grammar is 
assembled, piece-by-piece, as the speech signal is received. 

The Examiner's attention is directed to the fact that Brown fails to disclose or 
suggest the novel invention of acquiring or applying both a top-level grammar and one 
or more related subgrammars (including, for example, a word subgrammar, a phone 
subgrammar and a state subgrammar), as claimed in Applicant's independent claims 1, 
11, 18, 34 and 35. Specifically, Applicant's claims 1, 11, 18, 34 and 35 positively recite: 

1. A method for allocating memory in a speech recognition system comprising the 
steps of: 

acquiring a first set of data structures that contain a grammar, a word 
subgrammar, a phone subgrammar and a state subgrammar, each of the subgrammars 
related to the grammar; 

acquiring a speech signal; 

performing a probabilistic search using the speech signal as an input, and using 
the grammar and the subgrammars as possible inputs; and 

allocating memory for one of the subgrammars when a transition to that 
subgrammar is made during the probabilistic search. (Emphasis added) 

11. In a speech recognition system, a method for recognizing speech comprising the 
steps of: 

acquiring a first set of data structures that contain a grammar, a word 
subgrammar, a phone subgrammar and a state subgrammar, each of the subgrammars 
related to the grammar; 

acquiring a speech signal; 

performing a probabilistic search using the speech signal as an input, and using 
the grammar and the subgrammars as possible inputs; 

allocating memory for one of the subgrammars when a transition to that 
subgrammar is made during the probabilistic search; and 

computing a probability of a match between the speech signal and an element of 
the subgrammar for which memory has been allocated. (Emphasis added) 

18. In a speech recognition system, a method for recognizing speech comprising the 
steps of: 

acquiring a first set of data structures that contain a top level grammar and a 
plurality of subgrammars, each of the subgrammars hierarchically related to the grammar 
and to each other; 

acquiring a speech signal; 

performing a probabilistic search using the speech signal as an input, and using 
the top level grammar and the subgrammars as possible inputs; 

allocating memory for specific subgrammars when transitions to those specific 
subgrammars are made during the probabilistic search; and 

computing probabilities of matches between the speech signal and elements of 
the subgrammars for which memory has been allocated. (Emphasis added) 



34. A method for allocating memory in a speech recognition system comprising the 
steps of: 

acquiring a set of data structures that contain a grammar and one or more 
subgrammars related to the grammar; 

acquiring a speech signal; 

performing a probabilistic search using the speech signal as an input, and using 
the grammar and the subgrammars as possible inputs; and 

allocating memory for a selected one or more of the subgrammars when a 
transition to the selected subgrammar is made during the probabilistic search. 
(Emphasis added) 

35. In a speech recognition system, a method for recognizing speech comprising the 
steps of: 

(a) acquiring a set of data structures that contain a grammar and one or more 
subgrammars related to the grammar; 

(b) receiving spoken input; 

(c) using one or more of the data structures to recognize the spoken input; 

(d) while the speech recognition system is operating, acquiring a second set of 
data structures that contain a second grammar and one or more subgrammars related 
to the second grammar; and 

(e) repeating steps (b) and (c), using the second set of data structures in step 
(c). (Emphasis added) 
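
For illustration only, steps (a) through (e) of claim 35 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the Applicant's implementation: the `Recognizer` class, its `acquire` and `recognize` methods, the example word lists, and the exact-match lookup that stands in for real decoding are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical representation: grammar name -> words covered by its subgrammars.
GrammarSet = Dict[str, List[str]]


class Recognizer:
    def __init__(self, grammar_set: GrammarSet) -> None:
        # Step (a): acquire a first set of data structures.
        self.grammar_set = grammar_set

    def acquire(self, grammar_set: GrammarSet) -> None:
        # Step (d): acquire a second set while the recognizer keeps running.
        self.grammar_set = grammar_set

    def recognize(self, spoken: str) -> Optional[str]:
        # Steps (b)-(c): toy exact-match lookup standing in for real decoding.
        for grammar, words in self.grammar_set.items():
            if spoken in words:
                return grammar
        return None


rec = Recognizer({"days": ["monday", "tuesday"]})
first = rec.recognize("monday")       # matched against the first set
rec.acquire({"months": ["january", "february"]})
second = rec.recognize("january")     # step (e): matched against the second set
```

The point of the sketch is only that the second set replaces the first without the recognizer being torn down and rebuilt.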

Applicant's invention is directed to a method for allocating memory in a speech 
recognition system. Conventional speech recognition systems require a great deal of 
memory in order to accommodate and process large vocabularies. These systems 
typically compile, expand, flatten and optimize all grammars contained in a system 
vocabulary into a large, single-level data structure that must be stored in memory before 
the speech recognition system can operate. Such techniques substantially restrict the 
capabilities of speech recognition systems that operate on limited memory and 
processing power, such as portable speech recognition systems. 

The present invention provides a method for speech recognition in which 
memory is allocated to a particular system subgrammar when a transition is made to 
that subgrammar during a probabilistic search. A system vocabulary has a hierarchical 
data structure including at least one top-level grammar (e.g., "Days of the Week") and at 
least one subgrammar within that top-level grammar such as a word subgrammar (e.g., 
Monday, Tuesday, Wednesday, etc.), a phone subgrammar (e.g., /m/, /ah/, /n/, /d/, /ey/, 
etc.) and a state subgrammar (e.g., comprising Hidden Markov Models). When the 
system receives a speech signal for processing, the speech signal is input, along with 
the (unexpanded) top-level grammar and one or more subgrammars, into a probabilistic 
search. When a transition is made to a particular subgrammar during the probabilistic 
search, memory is allocated to the subgrammar, which may then be expanded and 
evaluated to assess the probability of a match between the speech signal and an 
element in the subgrammar. In this manner, memory is conserved and allocated only to 
portions of the system vocabulary that are currently needed for speech processing. 
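
The allocation-on-transition behavior described above can be illustrated with a short Python sketch. It is a simplified sketch under stated assumptions rather than the claimed method itself: the `Subgrammar` and `LazyVocabulary` names are hypothetical, expansion is reduced to copying a list, the phone sequence for "tuesday" is invented for the example, and the state (Hidden Markov Model) level is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Subgrammar:
    """Compact, unexpanded description of one level of the hierarchy."""
    name: str
    children: List[str] = field(default_factory=list)  # lower-level elements
    expanded: Optional[List[str]] = None  # stays None until memory is allocated


class LazyVocabulary:
    def __init__(self, subgrammars: Dict[str, Subgrammar]) -> None:
        self.subgrammars = subgrammars
        self.allocated: List[str] = []  # order in which memory was allocated

    def transition(self, name: str) -> List[str]:
        """Allocate (expand) a subgrammar only when the search enters it."""
        sub = self.subgrammars[name]
        if sub.expanded is None:
            sub.expanded = list(sub.children)  # stand-in for real expansion
            self.allocated.append(name)
        return sub.expanded


vocab = LazyVocabulary({
    "days_of_week": Subgrammar("days_of_week", ["monday", "tuesday"]),
    "monday": Subgrammar("monday", ["/m/", "/ah/", "/n/", "/d/", "/ey/"]),
    "tuesday": Subgrammar("tuesday", ["/t/", "/uw/", "/z/", "/d/", "/ey/"]),
})

# A search that follows only the "monday" path never allocates "tuesday".
vocab.transition("days_of_week")
vocab.transition("monday")
print(vocab.allocated)  # ['days_of_week', 'monday']
```

In this sketch the "tuesday" subgrammar remains unexpanded because no transition to it occurs, which mirrors the memory saving described in the preceding paragraph.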

In contrast, Brown teaches a method in which successive portions of a single 
grammar are implemented piece-by-piece, e.g., as more of the incoming speech signal 
is received. Thus, Brown fails to anticipate or make obvious Applicant's invention. 

Specifically, Brown only teaches that common-level portions of a grammar are 
gradually implemented. For example, after a first word in the speech signal is 
recognized using a first portion of the grammar, a new portion of the grammar is 
implemented that includes words that could potentially follow the recognized word in a 
valid command. Brown does not teach that the grammar has a hierarchical data 
structure, e.g., including not only a top-level grammar and word subgrammar, but 
corresponding phone and state grammars as well. Nor does Brown teach that system 
memory may be allocated to a subgrammar of a hierarchical data structure. 
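
By way of contrast, the word-level, piece-by-piece growth attributed to Brown in these remarks might be sketched as follows. This is an illustrative sketch only and is not taken from the Brown patent: the `FOLLOWERS` table, the command words in it, and the `grow_grammar` function are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical command grammar: each word maps to the words that may follow it.
FOLLOWERS: Dict[str, List[str]] = {
    "<start>": ["call", "play"],
    "call": ["home", "office"],
    "play": ["music"],
}


def grow_grammar(recognized: List[str]) -> Set[str]:
    """Return the words implemented after recognizing the given word sequence."""
    implemented: Set[str] = set(FOLLOWERS["<start>"])  # initial portion only
    for word in recognized:
        # Add the portion containing words that could follow the recognized word.
        implemented.update(FOLLOWERS.get(word, []))
    return implemented


print(sorted(grow_grammar(["call"])))  # ['call', 'home', 'office', 'play']
```

Every entry in this sketch sits at the same (word) level; nothing in it descends to phone or state structures, which is the distinction drawn in the preceding paragraph.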

PAGE 14/18 * RCVD AT 11/3/2004 6:09:08 PM [Eastern Standard Time] * SVR:USPTO-EFXRF-1/2 * DNIS:K7?9306 * CSID:732 530 9808 * DURATION (mm-ss):07-10 

The Examiner alleges that a grammar, a word subgrammar, a phone 
subgrammar and a state subgrammar, while not explicitly taught by Brown, are 
inherently acquired by the method taught by Brown. The Applicant respectfully 
disagrees. "To establish inherency, the extrinsic evidence 'must make clear that the 
missing descriptive matter is necessarily present in the thing described in the reference, 
and that it would be so recognized by persons of ordinary skill.'" In re Robertson, Slip Op. 
98-1270 (Fed. Cir. February 25, 1999) citing Continental Can Co. v. Monsanto Co., 948 
F.2d 1264, 1268, 20 USPQ2d 1746, 1749 (Fed. Cir. 1991) (Emphasis added). 
"Inherency, however, may not be established by probabilities or possibilities. The mere 
fact that a certain thing may result from a given set of circumstances is not sufficient." Id. 
citing Continental Can Co. v. Monsanto Co., 948 F.2d 1264, 1269, 20 USPQ2d 1746, 
1749 (Fed. Cir. 1991) (Emphasis added). 

Brown makes no mention of implementing data structures that descend as far as 
the sub-word (e.g., phone or state) level, much less of allocating memory for such levels 
of a data structure. At most, Brown teaches recognition at the word level. Thus, a data 
structure that includes a grammar, a word subgrammar, a phone subgrammar and a 
state subgrammar is not necessarily acquired by the method taught by Brown, and 
acquisition of such a data structure cannot be inherently present in Brown's teachings. 
Brown thus fails to teach or make obvious a method of allocating speech recognition 
system memory that acquires a top-level grammar and a plurality of subgrammars 
(including at least a word subgrammar, a phone subgrammar and a state subgrammar) 
and allocates memory to expand a subgrammar to which a transition is made, as 
positively claimed by the Applicant in claims 1, 11, 18, 34 and 35. Therefore, the 
Applicant submits that independent claims 1, 11, 18, 34 and 35 fully satisfy the 
requirements of 35 U.S.C. §102 and are patentable thereunder. 

Dependent claims 2, 8-10, 12-17 and 19-33 depend from claims 1, 11 and 18 
and recite additional features thereof. As such, and for at least the same reasons set 
forth above, the Applicant submits that claims 2, 8-10, 12-17 and 19-33 are not 
anticipated by the teachings of Brown. Therefore, the Applicant submits that dependent 
claims 2, 8-10, 12-17 and 19-33 also fully satisfy the requirements of 35 U.S.C. §102 
and are patentable thereunder. 

III. REJECTION OF CLAIMS 3-7 AND 36 UNDER 35 U.S.C. § 103 

The Examiner rejected claims 3-7 and 36 under 35 U.S.C. §103(a) as being 
unpatentable over Brown in view of the Ehsani et al. application (U.S. Publication No. 
2002/0032564, published March 14, 2002, hereinafter Ehsani). The Applicant 
respectfully traverses the rejection. 

Brown has been discussed above. Ehsani teaches a method for creating 
grammar networks for use in natural language voice user interfaces (NLVUIs). Valid 
phrases are extracted from a text corpus and clustered into classes to create a 
"thesaurus" of fixed word combinations that represent different ways of saying the same 
thing. In this way, anticipated user responses can be expanded into alternative 
linguistic variants. 

The Examiner's attention is directed to the fact that Brown and Ehsani (either 
singly or in any permissible combination) fail to disclose or suggest the novel invention 
of acquiring or applying both a top-level grammar and one or more related 
subgrammars (including, for example, a word subgrammar, a phone subgrammar and a 
state subgrammar), as claimed in Applicant's independent claim 1, from which claims 
3-7 depend, and independent claim 36. Independent claim 1 has been recited above. 
Applicant's independent claim 36 positively recites: 

36. In a speech recognition system, a method for recognizing speech comprising the 
steps of: 

(a) acquiring from a first remote computer a set of data structures that contain a 
grammar and one or more subgrammars related to the grammar; 

(b) receiving spoken input; 

(c) using one or more of the data structures to recognize the spoken input; 

(d) while the speech recognition system is operating, acquiring a second set of 
data structures from the first remote computer or from a second remote computer, the 
second set of data structures containing a second grammar and one or more 
subgrammars related to the second grammar; and 

(e) repeating steps (b) and (c), using the second set of data structures in step 
(c). (Emphasis added) 

As recited in the preceding claim, Applicant's invention teaches a method for 
speech recognition in which memory is allocated to a particular system subgrammar 
(e.g., a word, phone or state subgrammar) when a transition is made to that 
subgrammar during a probabilistic search. Memory allocation allows the subgrammar 
to be expanded and evaluated to assess the probability of a match between an input 
speech signal and an element in the subgrammar. In this manner, memory is 
conserved and allocated only to portions of the system vocabulary that are currently 
needed for speech processing. 

In contrast, neither Brown nor Ehsani teaches or suggests this novel approach. 
Neither Brown nor Ehsani teaches acquiring a hierarchical data structure including a 
top-level grammar and at least one subgrammar (e.g., including sub-word structures 
such as phone or state subgrammars) for the purposes of memory-efficient speech 
recognition processing, as positively claimed by the Applicant in claims 1 and 36. 
Therefore, the Applicant submits that independent claims 1 and 36 fully satisfy the 
requirements of 35 U.S.C. §103 and are patentable thereunder. 

Dependent claims 3-7 depend, either directly or indirectly, from claim 1 and recite 
additional features thereof. As such and for at least the same reasons set forth above, 
the Applicant submits that claims 3-7 are also not made obvious by the teachings of 
Brown in view of Ehsani. Therefore, the Applicant submits that dependent claims 3-7 
also fully satisfy the requirements of 35 U.S.C. § 103 and are patentable thereunder. 

IV. CONCLUSION 

Thus, the Applicant submits that all of the presented claims now fully satisfy the 
requirements of 35 U.S.C. §102 and §103. Consequently, the Applicant believes that all 
of these claims are presently in condition for allowance. Accordingly, both 
reconsideration of this application and its swift passage to issue are earnestly solicited. 

If, however, the Examiner believes that there are any unresolved issues requiring 
the issuance of a final action in any of the claims now pending in the application, it is 
requested that the Examiner telephone Mr. Kin-Wah Tong, Esq. at (732) 530-9404 so 
that appropriate arrangements can be made for resolving such issues as expeditiously 
as possible. 

Respectfully submitted, 

Date Kin-Wah Tong, Attorney 

Reg. No. 39,400 
(732) 530-9404 

Moser, Patterson & Sheridan, LLP 
595 Shrewsbury Avenue 
Shrewsbury, New Jersey 07702 